Using Data Provenance to Measure Information Assurance Attributes

Authors

  • Abha Moitra
  • Bruce Barnett
  • Andrew W. Crapo
  • Stephen J. Dil
Abstract

Data Provenance is multi-dimensional metadata that specifies Information Assurance attributes such as Confidentiality, Authenticity, Integrity, and Non-Repudiation. It may also include ownership, processing details, and other attributes. Further, each Information Assurance attribute may itself have sub-components, such as objective and subjective values, or application security versus transport security. Traditionally, Information Assurance attributes have been specified probabilistically as a belief value (or corresponding disbelief value) in that attribute. In this paper we introduce a framework based on Subjective Logic that incorporates uncertainty by representing values as a triple of <belief, disbelief, uncertainty>. This framework also allows us to work with conflicting Information Assurance attribute values that may arise from multiple views of an object. We also introduce a formal semantic model for specifying and reasoning over Information Assurance properties in a workflow.

Data Provenance information can grow substantially as the amount of information kept for each object increases and as the complexity of a workflow increases. In such situations, it may be necessary to summarize the Data Provenance information. Further, the summarization may depend on the Information Assurance attributes as well as on the type of analysis used for Data Provenance. We show how such summarization can be done and how it can be used to generate a trust value in the data. We also discuss how the Information Assurance values can be visualized.

Introduction

Our primary interest is in calculating the assurance in the data used. One of the components used to calculate this is the set of Information Assurance (IA) communication attributes, which includes confidentiality, integrity, authenticity, non-repudiation, and availability. Factors that impact this include opinions of the data sources and of the certificate authorities used during the authentication process. These values are based on the observer's viewpoint, loyalties, and knowledge, and are therefore highly subjective. For simplicity we will not address these factors in this paper. Instead, we will focus on the information assurance attributes of the communication itself, related to the communication channel and process. If all parties agree on the relative strength of cryptographic algorithms at a certain point in time, then this forms the basis for an objective and consistent measurement of information assurance values across multiple parties regarding a set of messages.

In this paper we describe a model of information flow based on simple and complex messages (messages with attachments) about which objective information assurance attribute values are collected. This model includes the capability to "roll up" data provenance information over a complex message and/or over a multi-step information flow. We call these aggregations "Figures of Merit". Given objective information assurance attribute values for a message or a figure of merit, our next goal is to summarize these in a simple visual icon that lets those who must act on information quickly see how confidential, authentic, and unmodified the data is, so that they can make better-informed choices when dealing with the data.

Previous Work

In our previous work[1], we developed a generalized and flexible framework that was independent of any implementation, yet allowed a series of data provenance records to be captured and analyzed. We summarize the framework below.

Each time a message is moved between agents, systems, or processes, a single Data Provenance (DP) record is created. This record might be stored or sent along in parallel with the message. During the analysis, all of the records related to a single message are assumed to be available. Each DP record has two parts: one from the sender and one from the receiver. Each part has an invariant and a variant section. The variant section may contain routing information to forward the message to the final destination, and may change during the routing process. The invariant section remains unchanged, allowing cryptographic hashes of this section to be consistent. The sender's invariant section may include the following components:

  • Identity of the Author of the message
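
The record layout described above can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the class names `Part` and `DPRecord`, the field names (`author`, `recipient`, `next_hop`), the JSON serialization, and the choice of SHA-256 are all assumptions made for the example. It shows one property the text states: changing the variant section (for example, routing information) leaves the hash of the invariant section unchanged.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class Part:
    """One half of a DP record (the sender's or the receiver's). Hypothetical layout."""
    invariant: dict                               # fixed once the record is created
    variant: dict = field(default_factory=dict)   # e.g. routing info; may change in transit

    def invariant_hash(self) -> str:
        # Serialize with sorted keys so identical content always yields the same bytes.
        payload = json.dumps(self.invariant, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


@dataclass
class DPRecord:
    """A single Data Provenance record: one part per side of the transfer."""
    sender: Part
    receiver: Part


# Hypothetical example values.
record = DPRecord(
    sender=Part(invariant={"author": "alice"}, variant={"next_hop": "relay-1"}),
    receiver=Part(invariant={"recipient": "bob"}),
)

hash_before = record.sender.invariant_hash()
record.sender.variant["next_hop"] = "relay-2"   # routing update touches only the variant section
hash_after = record.sender.invariant_hash()
```

Keeping the hash computation on the invariant section alone is what lets intermediate hops rewrite routing data without invalidating a hash taken at the origin.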
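
The belief, disbelief, and uncertainty triple mentioned in the abstract can be illustrated with a small Python sketch. The excerpt does not say which Subjective Logic operators the paper uses, so the example below assumes the standard consensus operator from the Subjective Logic literature as one plausible way to combine two views of the same attribute; the names `Opinion` and `consensus` and the numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    """A Subjective Logic opinion: belief + disbelief + uncertainty must equal 1."""
    belief: float
    disbelief: float
    uncertainty: float

    def __post_init__(self):
        total = self.belief + self.disbelief + self.uncertainty
        if abs(total - 1.0) > 1e-9:
            raise ValueError("belief, disbelief and uncertainty must sum to 1")


def consensus(a: Opinion, b: Opinion) -> Opinion:
    """Combine two independent opinions about the same statement (assumed operator)."""
    k = a.uncertainty + b.uncertainty - a.uncertainty * b.uncertainty
    if k == 0:
        # Both opinions have zero uncertainty; this form of the operator is undefined.
        raise ValueError("consensus is undefined when both opinions have no uncertainty")
    return Opinion(
        belief=(a.belief * b.uncertainty + b.belief * a.uncertainty) / k,
        disbelief=(a.disbelief * b.uncertainty + b.disbelief * a.uncertainty) / k,
        uncertainty=(a.uncertainty * b.uncertainty) / k,
    )


# Two conflicting views of, say, the integrity of one message (made-up numbers).
view_1 = Opinion(belief=0.8, disbelief=0.1, uncertainty=0.1)
view_2 = Opinion(belief=0.2, disbelief=0.6, uncertainty=0.2)
combined = consensus(view_1, view_2)
```

Under this operator the view with less uncertainty carries more weight, and the combined uncertainty is lower than that of either input, which is one way conflicting attribute values from multiple views could be reconciled.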





Publication date: 2010